Expanding vocabulary for recognizing user's abbreviations of proper nouns without increasing ASR error rates in spoken dialogue systems

Authors

Masaki Katsumaru

Kazunori Komatani

Tetsuya Ogata

Hiroshi G. Okuno

Abstract

Users often abbreviate long words when using spoken dialogue systems, which results in automatic speech recognition (ASR) errors. We define abbreviated words as sub-words of the original word and add them to the ASR dictionary. The first problem is that proper nouns cannot be correctly segmented by general morphological analyzers, although long and compound words need to be segmented in agglutinative languages such as Japanese. The second is that, as the vocabulary grows, adding many abbreviated words degrades ASR accuracy. We develop two methods: (1) segmenting words by using conjunction probabilities between characters, and (2) manipulating the occurrence probabilities of generated abbreviated words on the basis of the phonological similarities between abbreviated and original words. With our methods, ASR accuracy improves by 24.2 points for utterances containing abbreviated words and degrades by only 0.1 points for those containing original words.
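As a rough illustration of method (1) and of generating sub-word abbreviations, the sketch below estimates character-to-character conjunction probabilities from a word list, splits a long word wherever that probability drops below a threshold, and enumerates ordered sub-sequences of the resulting segments as abbreviation candidates. This is only a minimal sketch under stated assumptions: the function names, the bigram-style estimate, the threshold value, and the toy Latin-alphabet data are all hypothetical and are not taken from the paper, whose actual estimation and candidate-generation procedures may differ. Method (2), the adjustment of occurrence probabilities, is not sketched here because the abstract does not specify how it is computed.

```python
from collections import Counter
from itertools import combinations


def conjunction_probs(corpus):
    """Estimate P(next char | char) from adjacent character pairs (assumed estimator)."""
    pair_counts = Counter()
    first_counts = Counter()
    for word in corpus:
        for a, b in zip(word, word[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(word, probs, threshold=0.5):
    """Split a word wherever adjacent characters are weakly connected.

    Unseen character pairs are treated as probability 0.0 (an assumption).
    """
    if not word:
        return []
    segments, current = [], word[0]
    for a, b in zip(word, word[1:]):
        if probs.get((a, b), 0.0) < threshold:
            segments.append(current)  # weak link: close the current segment
            current = b
        else:
            current += b
    segments.append(current)
    return segments


def abbreviation_candidates(segments):
    """Join every ordered, proper subset of segments into a candidate sub-word."""
    n = len(segments)
    candidates = []
    for k in range(1, n):  # k < n, so the full original word is excluded
        for idx in combinations(range(n), k):
            candidates.append("".join(segments[i] for i in idx))
    return candidates


# Toy data (hypothetical): three short "morphemes" observed in a word list.
probs = conjunction_probs(["ab", "cd", "ef"])
parts = segment("abcdef", probs)
candidates = abbreviation_candidates(parts)
```

In this toy setting the character pairs that cross a morpheme boundary never occur in the word list, so the word is split exactly at those boundaries. Every candidate would then be added to the ASR dictionary alongside the original word, which is where the vocabulary-growth problem described in the abstract arises.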

Similar resources

Adjusting Occurrence Probabilities of Automatically-Generated Abbreviated Words in Spoken Dialogue Systems

Users often abbreviate long words when using spoken dialogue systems, which results in automatic speech recognition (ASR) errors. We define abbreviated words as sub-words of an original word and add them to the ASR dictionary. The first problem we face is that proper nouns cannot be correctly segmented by general morphological analyzers, although long and compound words need to be segmented in ...

Combined Systems for Automatic Phonetic Transcription of Proper Nouns

Large-vocabulary automatic speech recognition (ASR) technologies perform well in known, controlled contexts. However, recognition of proper nouns is commonly considered a difficult task. Accurate phonetic transcription of a proper noun is difficult to obtain, although it can be one of the most important resources for a recognition system. In this article, we propose methods of automatic phone...

Stochastic Language Adaptation over Time and State in Natural Spoken Dialogue

We are interested in adaptive spoken dialogue systems for automated services. People's spoken language usage varies over time for a given task, and furthermore varies depending on the state of the dialogue. Thus, it is crucial to adapt ASR language models to these varying conditions. We characterize and quantify these variations based on a database of 30K user-transactions with AT&T's experim...

Analyzing temporal transition of real user's behaviors in a spoken dialogue system

Managing various behaviors of real users is indispensable for spoken dialogue systems to operate adequately in real environments. We have analyzed various users’ behaviors using data collected over 34 months from the Kyoto City Bus Information System. We focused on “barge-in” and added barge-in rates to our analysis. Temporal transitions of users’ behaviors, such as automatic speech recognition...

Basic speech recognition for spoken dialogues

Spoken dialogue systems (SDSs) have great potential for information access in the developing world. However, the realisation of that potential requires the solution of several challenging problems, including the development of sufficiently accurate speech recognisers for a diverse multitude of languages. We investigate the feasibility of developing small-vocabulary speaker-independent ASR system...

Publication date: 2008